AITopics | subspace distance

Collaborative and Efficient Fine-tuning: Leveraging Task Similarity

Magakyan, Gagik, Reisizadeh, Amirhossein, Park, Chanwoo, Parrilo, Pablo A., Ozdaglar, Asuman

arXiv.org Machine Learning, Feb-10-2026

Adaptability is regarded as a central feature of foundation models, enabling them to adapt effectively to unseen downstream tasks. Parameter-efficient fine-tuning methods such as the celebrated LoRA facilitate efficient adaptation of large foundation models using labeled, high-quality, and generally scarce task data. To mitigate data scarcity in fine-tuning foundation models, we propose to leverage task similarity across multiple downstream users. Intuitively, users with similar tasks should be able to assist each other by boosting the effective fine-tuning data size. We propose Collaborative Low-Rank Adaptation, or CoLoRA, which exploits task similarity to collaboratively and efficiently fine-tune personalized foundation models. The main idea in CoLoRA is to train one shared adapter that captures underlying similarities across all tasks, along with personalized adapters tailored to user-specific tasks. We theoretically study CoLoRA on heterogeneous linear regression and provide provable guarantees for ground truth recovery. We also conduct several natural language experiments with varying task similarity, which further demonstrate that individual performance is significantly boosted when a task is trained together with similar tasks.
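The shared-plus-personalized adapter idea can be illustrated with a small NumPy sketch. The abstract does not state how the two adapters are combined, so the additive form `W0 + B_s A_s + B_u A_u`, the layer sizes, and every name below are assumptions made for illustration only, not the authors' parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank, num_users = 64, 64, 4, 3  # hypothetical sizes

# Frozen pretrained weight (stand-in for a single layer of a foundation model).
W0 = rng.normal(size=(d_out, d_in))

def init_adapter():
    """LoRA-style low-rank factors; B starts at zero so the adapter is a no-op."""
    B = np.zeros((d_out, rank))
    A = rng.normal(scale=0.01, size=(rank, d_in))
    return B, A

# One adapter shared by all users, plus one personalized adapter per user.
shared = init_adapter()
personal = [init_adapter() for _ in range(num_users)]

def effective_weight(user):
    """ASSUMED additive combination: frozen weight + shared + personal update."""
    B_s, A_s = shared
    B_u, A_u = personal[user]
    return W0 + B_s @ A_s + B_u @ A_u

def predict(user, x):
    """Linear prediction for one user with that user's effective weight."""
    return effective_weight(user) @ x

# Trainable parameters: (1 shared + num_users personal) low-rank adapters,
# versus one full weight matrix per user under full fine-tuning.
adapter_params = (1 + num_users) * rank * (d_out + d_in)
full_params = num_users * d_out * d_in
y = predict(0, rng.normal(size=d_in))
```

In this sketch only the shared factors would receive gradients from every user's data, which is one way pooled data could raise the effective sample size for the common structure.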

dist, machine learning, natural language, (16 more...)

arXiv:2602.07218

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > India > Tripura (0.04)

Genre: Research Report (0.81)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.45)


On The Concurrence of Layer-wise Preconditioning Methods and Provable Feature Learning

Zhang, Thomas T., Moniri, Behrad, Nagwekar, Ansh, Rahman, Faraz, Xue, Anton, Hassani, Hamed, Matni, Nikolai

arXiv.org Machine Learning, Feb-3-2025

Layer-wise preconditioning methods are a family of memory-efficient optimization algorithms that introduce preconditioners per axis of each layer's weight tensors. These methods have seen a recent resurgence, demonstrating impressive performance relative to entry-wise ("diagonal") preconditioning methods such as Adam(W) on a wide range of neural network optimization tasks. Complementary to their practical performance, we demonstrate that layer-wise preconditioning methods are provably necessary from a statistical perspective. To showcase this, we consider two prototypical models, linear representation learning and single-index learning, which are widely used to study how typical algorithms efficiently learn useful features to enable generalization. In these problems, we show SGD is a suboptimal feature learner when extending beyond ideal isotropic inputs $\mathbf{x} \sim \mathsf{N}(\mathbf{0}, \mathbf{I})$ and well-conditioned settings typically assumed in prior work. We demonstrate theoretically and numerically that this suboptimality is fundamental, and that layer-wise preconditioning emerges naturally as the solution. We further show that standard tools like Adam preconditioning and batch-norm only mildly mitigate these issues, supporting the unique benefits of layer-wise preconditioning.
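To make "preconditioners per axis" concrete, here is a minimal sketch of one well-known member of this family, a Shampoo-style update for a single weight matrix, next to an entry-wise (Adagrad-style) update. The abstract does not say which specific algorithm is analyzed, so this is an illustrative stand-in; the function names, step size, and damping value are assumptions.

```python
import numpy as np

def inv_root(M, p):
    """Inverse p-th root of a symmetric positive-definite matrix via eigh."""
    w, Q = np.linalg.eigh(M)
    return (Q * w ** (-1.0 / p)) @ Q.T

def layerwise_step(W, G, L, R, lr=0.1):
    """Shampoo-style step: one preconditioner per axis of the weight matrix."""
    L = L + G @ G.T   # statistics for the row (output) axis
    R = R + G.T @ G   # statistics for the column (input) axis
    update = inv_root(L, 4) @ G @ inv_root(R, 4)
    return W - lr * update, L, R, update

def diagonal_step(W, G, v, lr=0.1, eps=1e-8):
    """Entry-wise ("diagonal") preconditioning, Adagrad-style, for contrast."""
    v = v + G ** 2
    update = G / (np.sqrt(v) + eps)
    return W - lr * update, v, update

rng = np.random.default_rng(0)
m, n, damping = 4, 3, 1e-6  # hypothetical layer shape and damping
W = rng.normal(size=(m, n))
G = rng.normal(size=(m, n))  # stand-in for a gradient

W_lw, L, R, lw_update = layerwise_step(W, G, damping * np.eye(m), damping * np.eye(n))
W_dg, v, dg_update = diagonal_step(W, G, np.zeros((m, n)))

# After a single step from near-zero statistics, the layer-wise update is the
# gradient with its singular values rescaled to approximately 1.
singular_values = np.linalg.svd(lw_update, compute_uv=False)
```

The two updates differ in what they normalize: the entry-wise rule rescales each coordinate independently, while the per-axis rule rescales the gradient's singular values and so accounts for correlations along rows and columns.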

artificial intelligence, international conference, machine learning, (15 more...)

arXiv:2502.01763

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
North America > United States > Pennsylvania (0.04)

Genre: Research Report (0.81)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Contrastive Framework

Zhang, Hengyuan, Shang, Chenming, Wang, Sizhe, Zhang, Dongdong, Yao, Feng, Sun, Renliang, Yu, Yiyao, Yang, Yujiu, Wei, Furu

arXiv.org Artificial Intelligence, Dec-11-2024

Although fine-tuning Large Language Models (LLMs) with multilingual data can rapidly enhance their multilingual capabilities, the models still exhibit a performance gap between the dominant language (e.g., English) and non-dominant ones due to the imbalance of training data across languages. To further enhance the performance of non-dominant languages, we propose ShifCon, a Shift-based Contrastive framework that aligns the internal forward process of other languages toward that of the dominant one. Specifically, it shifts the representations of non-dominant languages into the dominant language subspace, allowing them to access the relatively rich information encoded in the model parameters. The enriched representations are then shifted back into their original language subspace before generation. Moreover, we introduce a subspace distance metric to pinpoint the optimal layer area for shifting representations and employ multilingual contrastive learning to further enhance the alignment of representations within this area. Experiments demonstrate that our ShifCon framework significantly enhances the performance of non-dominant languages, particularly for low-resource ones. Further analysis offers additional insights that verify the effectiveness of ShifCon and can propel future research.

computational linguistic, non-dominant language, representation, (15 more...)

arXiv:2410.19453

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.05)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


On the Adversarial Robustness of Subspace Learning

Li, Fuwei, Lai, Lifeng, Cui, Shuguang

arXiv.org Machine Learning, Aug-16-2019

In this paper, we study the adversarial robustness of subspace learning problems. Unlike existing work on robust subspace learning, which assumes that data samples are contaminated by gross sparse outliers or small dense noise, we consider a more powerful adversary who can first observe the data matrix and then intentionally modify the whole data matrix. We first characterize the optimal rank-one attack strategy that maximizes the subspace distance between the subspace learned from the original data matrix and that learned from the modified data matrix. We then generalize the study to the scenario without the rank constraint and characterize the corresponding optimal attack strategy. Our analysis shows that the optimal strategies depend on the singular values of the original data matrix and the adversary's energy budget. Finally, we provide numerical experiments and practical applications to demonstrate the efficiency of the attack strategies.
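As a rough illustration of the quantity being maximized, the sketch below learns a principal subspace by SVD, applies a rank-one modification with a fixed energy (Frobenius-norm) budget, and measures how far the learned subspace moves. The abstract does not say which subspace distance is used, so the projection (chordal) distance here is an assumed choice, and the perturbation is random rather than the optimal attack the paper characterizes. All names and sizes are hypothetical.

```python
import numpy as np

def top_subspace(X, k):
    """Orthonormal basis of the k-dimensional principal subspace of X's columns."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]

def principal_angles(U, V):
    """Principal angles (radians) between subspaces with orthonormal bases U, V."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.arccos(np.clip(s, 0.0, 1.0))

def projection_distance(U, V):
    """Frobenius distance between projectors; equals sqrt(2) * ||sin(angles)||."""
    return np.linalg.norm(U @ U.T - V @ V.T, ord="fro")

rng = np.random.default_rng(0)
d, n, k = 20, 100, 3  # ambient dimension, number of samples, subspace dimension

# Approximately rank-k data matrix with small dense noise.
X = rng.normal(size=(d, k)) @ rng.normal(size=(k, n)) + 0.01 * rng.normal(size=(d, n))

# Random rank-one modification scaled to the energy budget (NOT the optimal attack).
budget = 5.0
a, b = rng.normal(size=d), rng.normal(size=n)
delta = budget * np.outer(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

U_clean = top_subspace(X, k)
U_modified = top_subspace(X + delta, k)
dist = projection_distance(U_clean, U_modified)
angles = principal_angles(U_clean, U_modified)
```

Comparing `dist` across different budgets or different choices of `a` and `b` gives a feel for why the singular values of `X` matter: directions with small singular values are cheaper to displace.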

artificial intelligence, machine learning, subspace distance, (18 more...)

arXiv:1908.0621

Country:

Europe (1.00)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.89)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Compressed Subspace Learning Based on Canonical Angle Preserving Property

Jiao, Yuchen, Li, Gen, Gu, Yuantao

arXiv.org Machine Learning, Jul-14-2019

A standard way to tackle the challenging task of learning from high-dimensional data is to exploit its underlying low-dimensional structure. Union of Subspaces (UoS) is a popular and powerful model for describing such structure; it assumes that the data lie in the union of a collection of low-dimensional subspaces. Extracting useful information from the UoS structure of data has become the task of the newly emerged field of subspace learning. In this paper, we investigate how random projection, an efficient and commonly used method for dimensionality reduction, distorts the UoS structure of data. Here the fine details of UoS structure are described in terms of canonical angles (also known as principal angles) between subspaces, a well-known characterization of the relative position of subspaces by a sequence of angles. It is proved that random projection with the so-called Johnson-Lindenstrauss (JL) property approximately preserves canonical angles between subspaces. As canonical angles completely determine the relative position of subspaces, our result indicates that random projection approximately preserves the structure of a union of subspaces. Inspired by this result, we propose the framework of Compressed Subspace Learning (CSL), which enables the extraction of useful information from the UoS structure of data in a greatly reduced dimension and has the advantage of lower computational cost and memory requirements. We demonstrate the effectiveness of CSL in various subspace-related tasks such as subspace visualization, active subspace detection, and subspace clustering.
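The preservation claim can be checked numerically with a small sketch. It builds two 2-dimensional subspaces with known canonical angles, compresses them with a scaled Gaussian matrix (a standard example of a projection with the JL property), and recomputes the angles in the reduced dimension. The dimensions, angles, and names are assumptions chosen for illustration; this is not the CSL pipeline itself.

```python
import numpy as np

def principal_angles(A, B):
    """Canonical (principal) angles between the column spaces of A and B, ascending."""
    Qa, _ = np.linalg.qr(A)  # orthonormalize each basis first
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(s, 0.0, 1.0))

rng = np.random.default_rng(0)
d, m = 2000, 500                     # ambient and compressed dimensions
true_angles = np.array([0.5, 1.0])   # canonical angles built into the example

# U spans e_0, e_1; V tilts each basis vector toward a fresh coordinate.
U = np.zeros((d, 2))
U[0, 0] = U[1, 1] = 1.0
V = np.zeros((d, 2))
V[0, 0], V[2, 0] = np.cos(true_angles[0]), np.sin(true_angles[0])
V[1, 1], V[3, 1] = np.cos(true_angles[1]), np.sin(true_angles[1])

# Scaled Gaussian random projection from R^d to R^m.
P = rng.normal(size=(m, d)) / np.sqrt(m)

angles_before = principal_angles(U, V)
angles_after = principal_angles(P @ U, P @ V)
max_change = np.max(np.abs(angles_after - angles_before))
```

Shrinking `m` makes `max_change` grow, which matches the usual trade-off between compression and distortion in JL-type results.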

artificial intelligence, data mining, machine learning, (17 more...)

arXiv:1907.06166

Country:

Asia (0.46)
North America > United States (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.55)

Technology:

Information Technology > Data Science > Data Mining (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
